Automated Word Stress Detection in Russian

نویسندگان

  • Maria Ponomareva
  • Kirill Milintsevich
  • Ekaterina Chernyak
  • Anatoly Starostin
چکیده

In this study we address the problem of automated word stress detection in Russian using character level models and no partspeech-taggers. We use a simple bidirectional RNN with LSTM nodes and achieve the accuracy of 90% or higher. We experiment with two training datasets and show that using the data from an annotated corpus is much more efficient than using a dictionary, since it allows us to take into account word frequencies and the morphological context of the word.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automated Detection of Non-Relevant Posts on the Russian Imageboard "2ch": Importance of the Choice of Word Representations

This study considers the problem of automated detection of non-relevant posts on Web forums and discusses the approach of resolving this problem by approximation it with the task of detection of semantic relatedness between the given post and the opening post of the forum discussion thread. The approximated task could be resolved through learning the supervised classifier with a composed word e...

متن کامل

Automated WordNet Construction Using Word Embeddings

We present a fully unsupervised method for automated construction of WordNets based upon recent advances in distributional representations of sentences and word-senses combined with readily available machine translation tools. The approach requires very few linguistic resources and is thus extensible to multiple target languages. To evaluate our method we construct two 600-word test sets for wo...

متن کامل

Automatic word stress annotation of Russian unrestricted text

We evaluate the effectiveness of finitestate tools we developed for automatically annotating word stress in Russian unrestricted text. This task is relevant for computer-assisted language learning and text-to-speech. To our knowledge, this is the first study to empirically evaluate the results of this task. Given an adequate lexicon with specified stress, the primary obstacle for correct stress...

متن کامل

The acoustic characteristics of Russian vowels in children of 6 and 7 years of age

The purpose of this investigation is to examine the process of acoustic features of vowels from child speech approaching corresponding values in the normal Russian adult speech. The vowels formants structure, pitch and vowels duration were examined. Word stress and palatal context influence on the formants structure of the vowels were taken into account. It was shown that the word stress is for...

متن کامل

Design and Data Collection for the Accentological Corpus of the Russian Language

Accentological corpus provides a researcher an opportunity to study word stress and stress variation, which are very important for the Russian language. Moreover, Accentological corpus allows studying the history of the Russian language stress development. The research presents the main characteristics of Accentological corpus available at ruscorpora.ru. Corpora size, type and sources of text m...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017